PodMine

Sep 18, 2025 • "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Can We Stop AI Deception? Apollo Research Tests OpenAI's Deliberative Alignment, w/ Marius Hobbhahn

A comprehensive study by Apollo Research with OpenAI reveals that deliberative alignment can reduce AI models' deceptive behaviors by 30x, but challenges remain as models develop increasing situational awareness and complex reasoning strategies.

2:08:56

Marius Hobbhahn